CLIR-Based Collaborative Construction of a Multilingual Terminological Dictionary for Cultural Resources
Abstract
We describe ongoing work on a collaborative environment for constructing a CLIR-based multilingual terminological dictionary dedicated to the Digital Silk Road project and website, launched and managed by NII (National Institute of Informatics, Tokyo, Japan). A considerable amount of cultural resources has been digitized, including 95 rare books written in 10 different languages. To make them easily searchable and accessible to the site's visitors, who are themselves multilingual, a cross-lingual information retrieval system is being built. As these books are very rich in specialized terms, an important part of that endeavour is to gather these terms in many languages in a terminological dictionary (a database of terms containing some information potentially usable to later build a real terminological database). For that purpose, we use a participative approach, where visitors of the online archive are the main source of the terms used in the languages they know, while multilingual online resources are used to initialize the term base through a process that depends on the archived textual data.
1 The first, the third, and the fourth authors work at Grenoble Informatics Laboratory, GETALP, Université Joseph Fourier (Grenoble, France).
2 The second author works for the National Institute of Informatics (Tokyo, Japan).
Similar resources
Improved Cross-language Information Retrieval via Disambiguation and Vocabulary Discovery
Cross-lingual information retrieval (CLIR) allows people to find documents irrespective of the language used in the query or document. This thesis is concerned with the development of techniques to improve the effectiveness of Chinese–English CLIR. In Chinese–English CLIR, the accuracy of dictionary-based query translation is limited by two major factors: translation ambiguity and the presence ...
Towards Web Mining of Query Translations for Cross-Language Information Retrieval in Digital Libraries
This paper proposes an efficient client-server-based query translation approach to allowing more feasible implementation of cross-language information retrieval (CLIR) services in digital library (DL) systems. A centralized query translation server is constructed to process the translation requests of cross-lingual queries from connected DL systems. To extract translations not covered by standa...
Dbnary: Wiktionary as a LMF based Multilingual RDF network
Contributive resources, such as Wikipedia, have proved to be valuable in Natural Language Processing or Multilingual Information Retrieval applications. This article focusses on Wiktionary, the dictionary part of the collaborative resources sponsored by the Wikimedia
Building Specialized Multilingual Lexical Graphs Using Community Resources
We describe methods for compiling domain-dedicated multilingual terminological data from various resources. We focus on collecting data from online community users as the main source. Our approach therefore depends both on acquiring contributions from volunteers (explicit approach) and on analyzing users’ behaviors to extract interesting patterns and facts (implicit approach). As a ...
Dictionary-based CLIR for the CLEF Multilingual Track
This report describes the work done for our participation in the multilingual track of the Cross-Language Evaluation Forum (CLEF). We use a dictionary-based approach to translate English queries into German, French and Italian queries. We then apply a term disambiguation technique to select the best translation terms from the terms found in the dictionary entries, and a query expansion technique...
Publication date: 2010